Overview

Dataset statistics

Number of variables: 41
Number of observations: 2891738
Missing cells: 23048768
Missing cells (%): 19.4%
Total size in memory: 904.6 MiB
Average record size in memory: 328.0 B
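
The two derived figures above follow from the raw counts. A minimal Python sketch of that arithmetic, assuming (the report does not say so explicitly) that the missing-cell percentage is taken over all variables x observations cells and that 1 MiB is 2**20 bytes; the function names are illustrative only:

```python
# Figures copied from the Dataset statistics block.
N_VARIABLES = 41
N_OBSERVATIONS = 2_891_738
MISSING_CELLS = 23_048_768
TOTAL_SIZE_MIB = 904.6


def missing_cells_pct(missing: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing / (n_vars * n_obs)


def avg_record_size_bytes(total_mib: float, n_obs: int) -> float:
    """Average in-memory size of one record in bytes (assumes 1 MiB = 2**20 bytes)."""
    return total_mib * 2**20 / n_obs


pct = missing_cells_pct(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS)
rec = avg_record_size_bytes(TOTAL_SIZE_MIB, N_OBSERVATIONS)
print(f"{pct:.1f}%")
print(f"{rec:.1f} B")
```

Under these assumptions both values round to the figures shown in the report (19.4% and 328.0 B).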

Variable types

Text: 41

Dataset

Description: Meise Botanic Garden Herbarium (BR) 0007170-250310093411724
URL: https://doi.org/10.15468/dl.mrp6vb

Alerts

license has constant value "http://creativecommons.org/licenses/by/4.0/" Constant
rightsHolder has constant value "Meise Botanic Garden" Constant
datasetName has constant value "Meise Botanic Garden Herbarium" Constant
nomenclaturalCode has constant value "ICBN" Constant
recordNumber has 47728 (1.7%) missing values Missing
recordedBy has 31431 (1.1%) missing values Missing
recordedByID has 1180976 (40.8%) missing values Missing
eventDate has 660123 (22.8%) missing values Missing
year has 665278 (23.0%) missing values Missing
month has 817758 (28.3%) missing values Missing
day has 1213493 (42.0%) missing values Missing
verbatimEventDate has 1240382 (42.9%) missing values Missing
habitat has 2368064 (81.9%) missing values Missing
country has 35559 (1.2%) missing values Missing
countryCode has 39980 (1.4%) missing values Missing
locality has 229370 (7.9%) missing values Missing
locationRemarks has 1469995 (50.8%) missing values Missing
decimalLatitude has 2194373 (75.9%) missing values Missing
decimalLongitude has 2193931 (75.9%) missing values Missing
coordinateUncertaintyInMeters has 2619486 (90.6%) missing values Missing
typeStatus has 2827400 (97.8%) missing values Missing
acceptedNameUsage has 2680777 (92.7%) missing values Missing
kingdom has 29119 (1.0%) missing values Missing
phylum has 29240 (1.0%) missing values Missing
class has 29606 (1.0%) missing values Missing
order has 29654 (1.0%) missing values Missing
genus has 58006 (2.0%) missing values Missing
taxonomicStatus has 320988 (11.1%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
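
The alert lines above can be read as simple rules on per-column counts. The sketch below is an illustrative reconstruction in plain Python, not ydata-profiling's implementation: the function name `classify_alerts` and the 1% cut-off for the Missing alert are assumptions, chosen only because they are consistent with the alerts listed here.

```python
def classify_alerts(n_rows: int, n_missing: int, n_distinct: int,
                    missing_threshold: float = 0.01) -> list[str]:
    """Return alert labels for one column from its summary counts.

    missing_threshold is a hypothetical cut-off, not a documented default.
    """
    alerts = []
    n_present = n_rows - n_missing
    # Constant: every row holds the same single value.
    if n_present > 0 and n_missing == 0 and n_distinct == 1:
        alerts.append("Constant")
    # Missing: the share of missing rows reaches the assumed threshold.
    if n_missing / n_rows >= missing_threshold:
        alerts.append("Missing")
    # Unique: every row holds a different value.
    if n_present > 1 and n_missing == 0 and n_distinct == n_present:
        alerts.append("Unique")
    return alerts


N_ROWS = 2_891_738
print(classify_alerts(N_ROWS, 0, 1))             # counts reported for license
print(classify_alerts(N_ROWS, 47_728, 289_442))  # counts reported for recordNumber
print(classify_alerts(N_ROWS, 0, N_ROWS))        # counts reported for gbifID
```

With the counts reported for these three variables, the rules give one label each, matching the alert list.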

Reproduction

Analysis started: 2025-03-14 17:55:38.005504
Analysis finished: 2025-03-14 17:56:51.937886
Duration: 1 minute and 13.93 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
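
The reported duration is the difference between the two timestamps. A short sketch of that check with Python's standard `datetime` module (the format string is an assumption based on how the timestamps are printed above):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-03-14 17:55:38.005504", FMT)
finished = datetime.strptime("2025-03-14 17:56:51.937886", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```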

Variables

gbifID
Text

Unique 

Distinct: 2891738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 28917380
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
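
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category of a code point (it does not expose script or block, which need third-party data). The sketch below counts characters and general categories for the five sample gbifID values listed in this section. The helper name `char_profile` is hypothetical, and this is not necessarily how the report itself computes its figures.

```python
import unicodedata
from collections import Counter


def char_profile(values):
    """Count characters and Unicode general categories over an iterable of strings."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            categories[unicodedata.category(ch)] += 1
    return chars, categories


# The five sample rows shown for gbifID.
sample = ["4072468249", "1840134947", "4073209772", "1840134952", "4072468250"]
chars, categories = char_profile(sample)
print(sum(chars.values()))  # total characters in the sample
print(dict(categories))     # general categories; "Nd" is "Number, decimal digit"
```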

Unique

Unique: 2891738
Unique (%): 100.0%

Sample

1st row: 4072468249
2nd row: 1840134947
3rd row: 4073209772
4th row: 1840134952
5th row: 4072468250

Value                   Count    Frequency (%)
4072468249              1        < 0.1%
4073209866              1        < 0.1%
1840135015              1        < 0.1%
1840135044              1        < 0.1%
4073209775              1        < 0.1%
4073209772              1        < 0.1%
1840134952              1        < 0.1%
4072468250              1        < 0.1%
1840134965              1        < 0.1%
4073209774              1        < 0.1%
Other values (2891728)  2891728  > 99.9%

Most occurring characters

Value  Count    Frequency (%)
4      3940297  13.6%
0      3546184  12.3%
8      3392700  11.7%
1      3214993  11.1%
3      2925536  10.1%
2      2888882  10.0%
7      2862308  9.9%
9      2331150  8.1%
6      1931389  6.7%
5      1883941  6.5%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  28917380  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  28917380  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  28917380  100.0%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 43
Median length: 43
Mean length: 43
Min length: 43

Characters and Unicode

Total characters: 124344734
Distinct characters: 22
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: http://creativecommons.org/licenses/by/4.0/
2nd row: http://creativecommons.org/licenses/by/4.0/
3rd row: http://creativecommons.org/licenses/by/4.0/
4th row: http://creativecommons.org/licenses/by/4.0/
5th row: http://creativecommons.org/licenses/by/4.0/

Value                                       Count    Frequency (%)
http://creativecommons.org/licenses/by/4.0  2891738  100.0%

Most occurring characters

Value              Count     Frequency (%)
/                  17350428  14.0%
e                  11566952  9.3%
o                  8675214   7.0%
c                  8675214   7.0%
t                  8675214   7.0%
s                  8675214   7.0%
r                  5783476   4.7%
i                  5783476   4.7%
m                  5783476   4.7%
n                  5783476   4.7%
Other values (12)  37592594  30.2%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  124344734  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  124344734  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  124344734  100.0%

Distinct: 4986
Distinct (%): 0.2%
Missing: 4
Missing (%): < 0.1%
Memory size: 22.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 28917340
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 69
Unique (%): < 0.1%

Sample

1st row: 2022-12-25
2nd row: 2017-12-31
3rd row: 2022-12-25
4th row: 2017-12-31
5th row: 2022-12-25

Value                Count   Frequency (%)
2022-12-25           918568  31.8%
2017-12-31           269133  9.3%
2018-03-16           191006  6.6%
2018-10-05           129890  4.5%
2019-01-25           127932  4.4%
2006-08-18           127431  4.4%
2021-05-05           95916   3.3%
2024-03-07           54249   1.9%
2017-11-02           40176   1.4%
2023-08-23           19973   0.7%
Other values (4976)  917460  31.7%

Most occurring characters

Value  Count    Frequency (%)
2      8048644  27.8%
-      5783468  20.0%
0      5468173  18.9%
1      4379135  15.1%
5      1596760  5.5%
3      894686   3.1%
8      870353   3.0%
7      620747   2.1%
6      617531   2.1%
9      352694   1.2%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  28917340  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  28917340  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  28917340  100.0%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 57834760
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Meise Botanic Garden
2nd row: Meise Botanic Garden
3rd row: Meise Botanic Garden
4th row: Meise Botanic Garden
5th row: Meise Botanic Garden

Value    Count    Frequency (%)
meise    2891738  33.3%
botanic  2891738  33.3%
garden   2891738  33.3%

Most occurring characters

Value             Count     Frequency (%)
e                 8675214   15.0%
i                 5783476   10.0%
(space)           5783476   10.0%
a                 5783476   10.0%
n                 5783476   10.0%
M                 2891738   5.0%
s                 2891738   5.0%
B                 2891738   5.0%
o                 2891738   5.0%
t                 2891738   5.0%
Other values (4)  11566952  20.0%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  57834760  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  57834760  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  57834760  100.0%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 47
Median length: 47
Mean length: 46.94599926
Min length: 25

Characters and Unicode

Total characters: 135755530
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: http://biocol.org/urn:lsid:biocol.org:col:15605
2nd row: http://biocol.org/urn:lsid:biocol.org:col:15605
3rd row: http://biocol.org/urn:lsid:biocol.org:col:15605
4th row: http://biocol.org/urn:lsid:biocol.org:col:15605
5th row: http://biocol.org/urn:lsid:biocol.org:col:15605

Value                                            Count    Frequency (%)
http://biocol.org/urn:lsid:biocol.org:col:15605  2884640  99.8%
https://ror.org/00cv9y106                        6674     0.2%
https://ror.org/01r9htc13                        424      < 0.1%

Most occurring characters

Value              Count     Frequency (%)
o                  20206676  14.9%
:                  14430298  10.6%
l                  11538560  8.5%
r                  8675638   6.4%
/                  8675214   6.4%
c                  8661018   6.4%
i                  8653920   6.4%
t                  5783900   4.3%
g                  5776378   4.3%
.                  5776378   4.3%
Other values (15)  37577550  27.7%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  135755530  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  135755530  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  135755530  100.0%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 52051284
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gbif:ih:irn:124997
2nd row: gbif:ih:irn:124997
3rd row: gbif:ih:irn:124997
4th row: gbif:ih:irn:124997
5th row: gbif:ih:irn:124997

Value               Count    Frequency (%)
gbif:ih:irn:124997  2884640  99.8%
gbif:ih:irn:124613  6674     0.2%
gbif:ih:irn:126554  424      < 0.1%

Most occurring characters

Value             Count    Frequency (%)
i                 8675214  16.7%
:                 8675214  16.7%
9                 5769280  11.1%
1                 2898412  5.6%
g                 2891738  5.6%
b                 2891738  5.6%
f                 2891738  5.6%
h                 2891738  5.6%
r                 2891738  5.6%
n                 2891738  5.6%
Other values (6)  8682736  16.7%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  52051284  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  52051284  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  52051284  100.0%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.994797592
Min length: 3

Characters and Unicode

Total characters: 20227122
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MeiseBG
2nd row: MeiseBG
3rd row: MeiseBG
4th row: MeiseBG
5th row: MeiseBG

Value    Count    Frequency (%)
meisebg  2884640  99.8%
ugent    6674     0.2%
ulb      424      < 0.1%

Most occurring characters

Value  Count    Frequency (%)
e      5769280  28.5%
G      2891314  14.3%
B      2885064  14.3%
M      2884640  14.3%
i      2884640  14.3%
s      2884640  14.3%
U      7098     < 0.1%
E      6674     < 0.1%
N      6674     < 0.1%
T      6674     < 0.1%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  20227122  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  20227122  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  20227122  100.0%
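
The length statistics for this variable can be cross-checked against its value counts. The sketch below assumes the three values are spelled `MeiseBG`, `UGENT`, and `ULB` (the frequency table shows them lowercased; the capitalisation is inferred from the character counts) and that the mean is taken over non-missing rows.

```python
# Value counts from the frequency table; the capitalisation is an assumption.
value_counts = {"MeiseBG": 2_884_640, "UGENT": 6_674, "ULB": 424}

n_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows)                 # compare with the number of observations
print(total_chars)            # compare with "Total characters"
print(round(mean_length, 9))  # compare with "Mean length"
```

Under these assumptions the totals agree with the reported 20227122 characters and a mean length of about 6.9948.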

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.004910196
Min length: 2

Characters and Unicode

Total characters: 5797675
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BR
2nd row: BR
3rd row: BR
4th row: BR
5th row: BR

Value  Count    Frequency (%)
br     2884637  99.8%
gent   6674     0.2%
brlu   424      < 0.1%
awh    3        < 0.1%

Most occurring characters

Value  Count    Frequency (%)
B      2885061  49.8%
R      2885061  49.8%
G      6674     0.1%
E      6674     0.1%
N      6674     0.1%
T      6674     0.1%
L      424      < 0.1%
U      424      < 0.1%
A      3        < 0.1%
W      3        < 0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  5797675  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  5797675  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  5797675  100.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 30
Median length: 30
Mean length: 30
Min length: 30

Characters and Unicode

Total characters: 86752140
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Meise Botanic Garden Herbarium
2nd row: Meise Botanic Garden Herbarium
3rd row: Meise Botanic Garden Herbarium
4th row: Meise Botanic Garden Herbarium
5th row: Meise Botanic Garden Herbarium

Value      Count    Frequency (%)
meise      2891738  25.0%
botanic    2891738  25.0%
garden     2891738  25.0%
herbarium  2891738  25.0%

Most occurring characters

Value             Count     Frequency (%)
e                 11566952  13.3%
i                 8675214   10.0%
(space)           8675214   10.0%
a                 8675214   10.0%
r                 8675214   10.0%
n                 5783476   6.7%
G                 2891738   3.3%
u                 2891738   3.3%
b                 2891738   3.3%
H                 2891738   3.3%
Other values (8)  23133904  26.7%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  86752140  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  86752140  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  86752140  100.0%

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 18
Median length: 17
Mean length: 16.97205037
Min length: 10

Characters and Unicode

Total characters: 49078723
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value               Count    Frequency (%)
preservedspecimen   2856671  98.8%
materialsample      11727    0.4%
humanobservation    10957    0.4%
machineobservation  6497     0.2%
occurrence          5881     0.2%
livingspecimen      5        < 0.1%

Most occurring characters

Value              Count     Frequency (%)
e                  14342532  29.2%
r                  5754285   11.7%
n                  2897470   5.9%
i                  2892364   5.9%
c                  2880816   5.9%
m                  2879360   5.9%
v                  2874130   5.9%
s                  2874125   5.9%
S                  2868403   5.8%
p                  2868403   5.8%
Other values (14)  5946835   12.1%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  49078723  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  49078723  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  49078723  100.0%

occurrenceID
Text

Unique 

Distinct: 2891738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 61
Median length: 59
Mean length: 59.04975278
Min length: 54

Characters and Unicode

Total characters: 170756414
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2891738
Unique (%): 100.0%

Sample

1st row: http://www.botanicalcollections.be/specimen/BR0000035634553
2nd row: http://www.botanicalcollections.be/specimen/BR0000016406988
3rd row: http://www.botanicalcollections.be/specimen/BR0000035634607
4th row: http://www.botanicalcollections.be/specimen/BR0000016407022
5th row: http://www.botanicalcollections.be/specimen/BR0000035634652

Value                                                        Count    Frequency (%)
http://www.botanicalcollections.be/specimen/br0000035634553  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000035643517  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000016407473  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000016408012  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000035634805  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000035634607  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000016407022  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000035634652  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000016407077  1        < 0.1%
http://www.botanicalcollections.be/specimen/br0000035634706  1        < 0.1%
Other values (2891728)                                       2891728  > 99.9%

Most occurring characters

Value              Count     Frequency (%)
0                  15784271  9.2%
t                  11566952  6.8%
e                  11566952  6.8%
/                  11566952  6.8%
c                  11566952  6.8%
l                  8675214   5.1%
i                  8675214   5.1%
n                  8675214   5.1%
o                  8675214   5.1%
w                  8675214   5.1%
Other values (29)  65328265  38.3%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  170756414  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  170756414  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  170756414  100.0%

catalogNumber
Text

Unique 

Distinct: 2891738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 17
Median length: 15
Mean length: 15.04975278
Min length: 10

Characters and Unicode

Total characters: 43519942
Distinct characters: 22
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2891738
Unique (%): 100.0%

Sample

1st row: BR0000035634553
2nd row: BR0000016406988
3rd row: BR0000035634607
4th row: BR0000016407022
5th row: BR0000035634652

Value                   Count    Frequency (%)
br0000035634553         1        < 0.1%
br0000035643517         1        < 0.1%
br0000016407473         1        < 0.1%
br0000016408012         1        < 0.1%
br0000035634805         1        < 0.1%
br0000035634607         1        < 0.1%
br0000016407022         1        < 0.1%
br0000035634652         1        < 0.1%
br0000016407077         1        < 0.1%
br0000035634706         1        < 0.1%
Other values (2891728)  2891728  > 99.9%

Most occurring characters

Value              Count     Frequency (%)
0                  15784271  36.3%
1                  3308290   7.6%
2                  2897190   6.7%
B                  2883795   6.6%
R                  2883795   6.6%
3                  2697724   6.2%
5                  2616918   6.0%
6                  2107659   4.8%
4                  2086039   4.8%
8                  2076027   4.8%
Other values (12)  4178234   9.6%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  43519942  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  43519942  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  43519942  100.0%

recordNumber
Text

Missing 

Distinct: 289442
Distinct (%): 10.2%
Missing: 47728
Missing (%): 1.7%
Memory size: 22.1 MiB

Length

Max length: 63
Median length: 4
Mean length: 4.127984079
Min length: 1

Characters and Unicode

Total characters: 11740028
Distinct characters: 155
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 207185
Unique (%): 7.3%

Sample

1st row: 328
2nd row: 102
3rd row: S.N.
4th row: 3439
5th row: S.N.

Value                  Count    Frequency (%)
s.n                    903218   31.5%
b                      4758     0.2%
2                      4204     0.1%
1                      4125     0.1%
(blank)                3738     0.1%
3                      3064     0.1%
4                      2577     0.1%
5                      2399     0.1%
6                      2162     0.1%
7                      2148     0.1%
Other values (270414)  1933042  67.5%

Most occurring characters

Value               Count    Frequency (%)
.                   1865500  15.9%
1                   1185345  10.1%
S                   927351   7.9%
N                   915439   7.8%
2                   887755   7.6%
3                   780653   6.6%
4                   728206   6.2%
5                   684874   5.8%
6                   677204   5.8%
7                   664341   5.7%
Other values (145)  2423360  20.6%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  11740028  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  11740028  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  11740028  100.0%
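
The recordNumber percentages and mean length appear to be computed over non-missing rows rather than all rows. This is an inference from the numbers, not something the report states. A short sketch of the arithmetic:

```python
# Counts copied from the recordNumber section.
N_ROWS = 2_891_738
MISSING = 47_728
DISTINCT = 289_442
UNIQUE = 207_185
TOTAL_CHARS = 11_740_028

# Assumed denominator: rows that actually hold a value.
non_missing = N_ROWS - MISSING

distinct_pct = 100 * DISTINCT / non_missing
unique_pct = 100 * UNIQUE / non_missing
mean_length = TOTAL_CHARS / non_missing
print(f"{distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.4f}")
```

Dividing by all 2891738 rows instead would give about 10.0% distinct, which does not match the reported 10.2%.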

recordedBy
Text

Missing 

Distinct: 94792
Distinct (%): 3.3%
Missing: 31431
Missing (%): 1.1%
Memory size: 22.1 MiB

Length

Max length: 200
Median length: 190
Mean length: 13.31852525
Min length: 1

Characters and Unicode

Total characters: 38095071
Distinct characters: 270
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 63180
Unique (%): 2.2%

Sample

1st row: Todaro A.
2nd row: Senderayi E.
3rd row: Nitka J.
4th row: Germishuizen G.
5th row: Vašák V.

Value                 Count    Frequency (%)
j                     329023   4.7%
a                     256058   3.7%
(blank)               230724   3.3%
de                    179854   2.6%
h                     161236   2.3%
m                     150913   2.2%
f                     136040   1.9%
p                     135877   1.9%
g                     135464   1.9%
c                     132389   1.9%
Other values (56392)  5155262  73.6%

Most occurring characters

Value               Count     Frequency (%)
(whitespace)        4142533   10.9%
.                   3802645   10.0%
e                   3297717   8.7%
n                   2106270   5.5%
a                   1921214   5.0%
r                   1842397   4.8%
o                   1681496   4.4%
i                   1363258   3.6%
l                   1345083   3.5%
s                   1220920   3.2%
Other values (260)  15371538  40.4%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  38095071  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  38095071  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  38095071  100.0%

recordedByID
Text

Missing 

Distinct: 1711
Distinct (%): 0.1%
Missing: 1180976
Missing (%): 40.8%
Memory size: 22.1 MiB

Length

Max length: 87
Median length: 81
Mean length: 41.39642335
Min length: 26

Characters and Unicode

Total characters: 70819428
Distinct characters: 48
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): < 0.1%

Sample

1st row: http://viaf.org/viaf/10628368
2nd row: http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/8faac441-8d05-4bab-9c22-c97cd7df1145
3rd row: http://viaf.org/viaf/22916664
4th row: https://kiki.huh.harvard.edu/databases/botanist_search.php?mode=details&id=5495
5th row: http://www.wikidata.org/entity/Q6491480
Value    Count    Frequency (%)
http://viaf.org/viaf/89445626    41445    2.4%
https://orcid.org/0000-0001-7949-2594    32096    1.9%
http://viaf.org/viaf/289994763    30837    1.8%
http://viaf.org/viaf/115629741    28790    1.7%
http://viaf.org/viaf/51826391    28617    1.7%
http://viaf.org/viaf/166699679    23812    1.4%
http://viaf.org/viaf/111968476    21656    1.3%
https://orcid.org/0000-0003-0223-8496    21429    1.3%
http://viaf.org/viaf/36967258    20405    1.2%
http://viaf.org/viaf/115629067    19610    1.1%
Other values (1701)    1442065    84.3%

Most occurring characters

Value    Count    Frequency (%)
/    6852006    9.7%
t    5000725    7.1%
i    4606526    6.5%
a    4468942    6.3%
h    3220871    4.5%
.    2948214    4.2%
r    2713195    3.8%
p    2449425    3.5%
o    2378902    3.4%
d    2348927    3.3%
Other values (38)    33831695    47.8%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    70819428    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    70819428    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    70819428    100.0%
Distinct: 35
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 35
Median length: 14
Mean length: 13.94403366
Min length: 3

Characters and Unicode

Total characters: 40322492
Distinct characters: 36
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: HerbariumSheet
2nd row: HerbariumSheet
3rd row: HerbariumSheet
4th row: HerbariumSheet
5th row: HerbariumSheet
Value    Count    Frequency (%)
herbariumsheet    2832166    97.2%
liquidpreserved    11712    0.4%
silica    10869    0.4%
gel    10869    0.4%
seed    6459    0.2%
description    6244    0.2%
unknown    5881    0.2%
photograph:b&w    5153    0.2%
drawing    4653    0.2%
in    4596    0.2%
Other values (36)    13825    0.5%

Most occurring characters

Value    Count    Frequency (%)
e    8571144    21.3%
r    5713561    14.2%
i    2906249    7.2%
a    2862235    7.1%
S    2849737    7.1%
h    2849013    7.1%
u    2848643    7.1%
t    2848607    7.1%
b    2842533    7.0%
m    2834471    7.0%
Other values (26)    3196299    7.9%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    40322492    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    40322492    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    40322492    100.0%

eventDate
Text

Missing 

Distinct: 69404
Distinct (%): 3.1%
Missing: 660123
Missing (%): 22.8%
Memory size: 22.1 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.128450024
Min length: 4

Characters and Unicode

Total characters: 20371186
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9517
Unique (%): 0.4%

Sample

1st row: 1977-03-30
2nd row: 1960-08-05
3rd row: 1978-08-05
4th row: 1826
5th row: 1964-07-03
Value    Count    Frequency (%)
1913    3110    0.1%
1840    2639    0.1%
1868    2204    0.1%
1836    2172    0.1%
1900    2156    0.1%
1840-04    2092    0.1%
1925    2039    0.1%
1906    1976    0.1%
1921    1968    0.1%
1936    1959    0.1%
Other values (69394)    2209300    99.0%

Most occurring characters

Value    Count    Frequency (%)
-    3788642    18.6%
1    3765026    18.5%
0    3137516    15.4%
9    2465459    12.1%
8    1423662    7.0%
2    1391623    6.8%
7    1061621    5.2%
6    941031    4.6%
5    929132    4.6%
3    756892    3.7%
Other values (2)    710582    3.5%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    20371186    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    20371186    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    20371186    100.0%

year
Text

Missing 

Distinct: 278
Distinct (%): < 0.1%
Missing: 665278
Missing (%): 23.0%
Memory size: 22.1 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8905840
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: 1977
2nd row: 1960
3rd row: 1978
4th row: 1826
5th row: 1964
Value    Count    Frequency (%)
1958    33489    1.5%
1959    32555    1.5%
1974    30686    1.4%
1957    30590    1.4%
1971    29351    1.3%
1969    28697    1.3%
1972    28456    1.3%
1975    28219    1.3%
1952    27662    1.2%
1970    27353    1.2%
Other values (268)    1929402    86.7%

Most occurring characters

Value    Count    Frequency (%)
1    2422490    27.2%
9    2118041    23.8%
8    987391    11.1%
0    574962    6.5%
7    565748    6.4%
6    505266    5.7%
5    504729    5.7%
2    478490    5.4%
3    396650    4.5%
4    352073    4.0%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    8905840    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    8905840    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    8905840    100.0%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 817758
Missing (%): 28.3%
Memory size: 22.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 4147960
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 03
2nd row: 08
3rd row: 08
4th row: 07
5th row: 11
Value    Count    Frequency (%)
07    325107    15.7%
06    263779    12.7%
08    253353    12.2%
05    239410    11.5%
04    173684    8.4%
09    171099    8.2%
10    133202    6.4%
03    116065    5.6%
11    111546    5.4%
02    97952    4.7%
Other values (2)    188783    9.1%

Most occurring characters

Value    Count    Frequency (%)
0    1870136    45.1%
1    545077    13.1%
7    325107    7.8%
6    263779    6.4%
8    253353    6.1%
5    239410    5.8%
2    190250    4.6%
4    173684    4.2%
9    171099    4.1%
3    116065    2.8%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    4147960    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    4147960    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    4147960    100.0%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 1213493
Missing (%): 42.0%
Memory size: 22.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 3356490
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30
2nd row: 05
3rd row: 05
4th row: 03
5th row: 11
Value    Count    Frequency (%)
20    62502    3.7%
10    62496    3.7%
15    62363    3.7%
25    56845    3.4%
18    56728    3.4%
21    56520    3.4%
12    56500    3.4%
17    56164    3.3%
16    55887    3.3%
01    55845    3.3%
Other values (21)    1096395    65.3%

Most occurring characters

Value    Count    Frequency (%)
1    759805    22.6%
2    709802    21.1%
0    663853    19.8%
3    235955    7.0%
5    175032    5.2%
8    165411    4.9%
6    163922    4.9%
7    163734    4.9%
4    162164    4.8%
9    156812    4.7%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    3356490    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    3356490    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    3356490    100.0%

verbatimEventDate
Text

Missing 

Distinct: 192476
Distinct (%): 11.7%
Missing: 1240382
Missing (%): 42.9%
Memory size: 22.1 MiB

Length

Max length: 322
Median length: 189
Mean length: 9.082652681
Min length: 1

Characters and Unicode

Total characters: 14998693
Distinct characters: 153
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 84355
Unique (%): 5.1%

Sample

1st row: [?/6/?]
2nd row: [5/8/1960]
3rd row: (06/11/85)
4th row: [5/8/1978]
5th row: [3/7/1964]
Value    Count    Frequency (%)
s.d    315687    16.4%
00000000    27587    1.4%
(blank)    13075    0.7%
aug    10550    0.5%
jul    8299    0.4%
juin    7614    0.4%
mai    7355    0.4%
juillet    7108    0.4%
may    4894    0.3%
et    4650    0.2%
Other values (155656)    1518390    78.9%

Most occurring characters

Value    Count    Frequency (%)
1    2026727    13.5%
/    1642689    11.0%
9    1336495    8.9%
0    1185437    7.9%
8    993078    6.6%
[    754780    5.0%
]    754769    5.0%
2    695131    4.6%
.    685373    4.6%
7    656565    4.4%
Other values (143)    4267649    28.5%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    14998693    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    14998693    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    14998693    100.0%

habitat
Text

Missing 

Distinct: 226986
Distinct (%): 43.3%
Missing: 2368064
Missing (%): 81.9%
Memory size: 22.1 MiB

Length

Max length: 1207
Median length: 572
Mean length: 27.86994963
Min length: 1

Characters and Unicode

Total characters: 14594768
Distinct characters: 215
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180540
Unique (%): 34.5%

Sample

1st row: growing with Aloe chibaudii towards base of rock mass in open.
2nd row: Im Sumpfgebiet. In[]die Hypericum-Stamm-Geflechte.
3rd row: Field layer of recently exploited forest, forming dense dominant areas in certain places.
4th row: hills behind Small Lake. On exposed hillside
5th row: On lateritic plain.
Value    Count    Frequency (%)
de    87844    3.8%
forêt    54802    2.4%
in    53572    2.3%
forest    39409    1.7%
on    38664    1.7%
la    33358    1.5%
(blank)    33224    1.4%
sur    29056    1.3%
of    23484    1.0%
à    23111    1.0%
Other values (79333)    1883783    81.9%

Most occurring characters

Value    Count    Frequency (%)
(blank)    1776627    12.2%
e    1511500    10.4%
r    1027458    7.0%
a    1026379    7.0%
o    892082    6.1%
s    866019    5.9%
i    828313    5.7%
n    793138    5.4%
t    685486    4.7%
l    596987    4.1%
Other values (205)    4590779    31.5%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    14594768    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    14594768    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    14594768    100.0%

country
Text

Missing 

Distinct: 252
Distinct (%): < 0.1%
Missing: 35559
Missing (%): 1.2%
Memory size: 22.1 MiB

Length

Max length: 38
Median length: 33
Mean length: 12.23389535
Min length: 3

Characters and Unicode

Total characters: 34942195
Distinct characters: 63
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: Italy
2nd row: Zimbabwe
3rd row: Czech Republic
4th row: South Africa
5th row: Slovakia
Value    Count    Frequency (%)
republic    512534    9.8%
congo    496202    9.5%
belgium    493562    9.5%
of    490735    9.4%
the    490428    9.4%
democratic    490338    9.4%
france    285843    5.5%
country    174736    3.3%
unknown    174736    3.3%
usa    72839    1.4%
Other values (300)    1534738    29.4%

Most occurring characters

Value    Count    Frequency (%)
e    3010498    8.6%
o    2834914    8.1%
n    2425314    6.9%
a    2420669    6.9%
i    2394162    6.9%
(blank)    2360512    6.8%
c    1989403    5.7%
u    1712033    4.9%
r    1605087    4.6%
t    1511096    4.3%
Other values (53)    12678507    36.3%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    34942195    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    34942195    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    34942195    100.0%

countryCode
Text

Missing 

Distinct: 277
Distinct (%): < 0.1%
Missing: 39980
Missing (%): 1.4%
Memory size: 22.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 5703516
Distinct characters: 37
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): < 0.1%

Sample

1st row: IT
2nd row: ZW
3rd row: CZ
4th row: ZA
5th row: SK
Value    Count    Frequency (%)
be    493563    17.3%
cd    490338    17.2%
fr    285846    10.0%
zz    174726    6.1%
us    72839    2.6%
tz    61585    2.2%
br    61255    2.1%
es    58476    2.1%
cm    46377    1.6%
de    45578    1.6%
Other values (258)    1061175    37.2%

Most occurring characters

Value    Count    Frequency (%)
E    698854    12.3%
C    697664    12.2%
B    639457    11.2%
D    573136    10.0%
Z    526591    9.2%
R    492153    8.6%
F    314981    5.5%
M    190281    3.3%
S    188449    3.3%
T    187720    3.3%
Other values (27)    1194230    20.9%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    5703516    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    5703516    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    5703516    100.0%

locality
Text

Missing 

Distinct: 1257110
Distinct (%): 47.2%
Missing: 229370
Missing (%): 7.9%
Memory size: 22.1 MiB

Length

Max length: 483
Median length: 321
Mean length: 31.94652542
Min length: 1

Characters and Unicode

Total characters: 85053407
Distinct characters: 752
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1048048
Unique (%): 39.4%

Sample

1st row: Sicula; In montosis - Piana dei Greci
2nd row: District Victoria, Makaholi Experiment Station, in the Legume Pasture Trial Plats in camp 17
3rd row: Bohemia centr., distr.Praha: in locis graminosis prope vicum Modletice
4th row: Regio Transvaal, 2329BB, Louis Trichardt, open plot in town, Grassland
5th row: Slovakia orientalis: regio Slovenský kras: planities Silická planina, in colle Čertova Diera
Value    Count    Frequency (%)
de    648757    5.0%
la    233643    1.8%
s.l    224392    1.7%
of    172601    1.3%
(blank)    166615    1.3%
du    158876    1.2%
km    157771    1.2%
in    117487    0.9%
à    106405    0.8%
près    78146    0.6%
Other values (615423)    10988219    84.2%

Most occurring characters

Value    Count    Frequency (%)
(blank)    10390496    12.2%
e    7258017    8.5%
a    6967102    8.2%
r    4575344    5.4%
o    4471530    5.3%
i    4439039    5.2%
n    4352217    5.1%
s    3375868    4.0%
t    3271335    3.8%
l    3183238    3.7%
Other values (742)    32769221    38.5%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    85053407    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    85053407    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    85053407    100.0%

locationRemarks
Text

Missing 

Distinct: 259911
Distinct (%): 18.3%
Missing: 1469995
Missing (%): 50.8%
Memory size: 22.1 MiB

Length

Max length: 1949
Median length: 1467
Mean length: 30.2774334
Min length: 10

Characters and Unicode

Total characters: 43046729
Distinct characters: 322
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214057
Unique (%): 15.1%

Sample

1st row: Country: Rhodesia
2nd row: Country: Bohemia
3rd row: Country: Transvaal
4th row: Country: Slovakia
5th row: Country: C.b.s.
Value    Count    Frequency (%)
country    1054900    18.0%
substate    202687    3.5%
description    200328    3.4%
congo    105644    1.8%
belge    92850    1.6%
de    74635    1.3%
france    68322    1.2%
on    47929    0.8%
vegetation    47894    0.8%
green    46343    0.8%
Other values (112983)    3907769    66.8%

Most occurring characters

Value    Count    Frequency (%)
(blank)    4427545    10.3%
e    3434906    8.0%
n    2984497    6.9%
r    2966677    6.9%
t    2899812    6.7%
o    2763473    6.4%
a    2484957    5.8%
u    2278609    5.3%
i    2072307    4.8%
s    1841098    4.3%
Other values (312)    14892848    34.6%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    43046729    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    43046729    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    43046729    100.0%

decimalLatitude
Text

Missing 

Distinct: 43086
Distinct (%): 6.2%
Missing: 2194373
Missing (%): 75.9%
Memory size: 22.1 MiB

Length

Max length: 10
Median length: 9
Mean length: 8.21867028
Min length: 1

Characters and Unicode

Total characters: 5731413
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17463
Unique (%): 2.5%

Sample

1st row: 9
2nd row: 4.033333
3rd row: -29.316667
4th row: -3.833333
5th row: 64.020833
Value    Count    Frequency (%)
0.766667    8710    1.2%
0.056944    5745    0.8%
5.133333    4120    0.6%
0.767499    3850    0.6%
0.05    3766    0.5%
3.444444    2536    0.4%
2.883333    2371    0.3%
4    2312    0.3%
5.137777    2023    0.3%
1.416667    2021    0.3%
Other values (40528)    659911    94.6%

Most occurring characters

Value    Count    Frequency (%)
3    761899    13.3%
.    686873    12.0%
6    635076    11.1%
5    590336    10.3%
1    544979    9.5%
0    413087    7.2%
7    405021    7.1%
9    368457    6.4%
2    362241    6.3%
4    357674    6.2%
Other values (2)    605770    10.6%

Most occurring categories

Value    Count    Frequency (%)
(unknown)    5731413    100.0%

Most occurring scripts

Value    Count    Frequency (%)
(unknown)    5731413    100.0%

Most occurring blocks

Value    Count    Frequency (%)
(unknown)    5731413    100.0%

decimalLongitude
Text

Missing 

Distinct: 50224
Distinct (%): 7.2%
Missing: 2193931
Missing (%): 75.9%
Memory size: 22.1 MiB

Length

Max length: 11
Median length: 10
Mean length: 7.854466923
Min length: 1

Characters and Unicode

Total characters: 5480902
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19271
Unique (%): 2.8%

Sample

1st row: 39.916667
2nd row: 32.85
3rd row: 2.5
4th row: 29.779166
5th row: -21.218888
ValueCountFrequency (%)
24.45 8124
 
1.2%
18.309166 5576
 
0.8%
24.446944 3845
 
0.6%
18.316667 3634
 
0.5%
15.1 3165
 
0.5%
25.691944 2526
 
0.4%
29.55 2157
 
0.3%
0 2153
 
0.3%
15.071666 2023
 
0.3%
27.466667 2010
 
0.3%
Other values (48234) 662594
95.0%
2025-03-14T13:57:07.059904image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

Value                 Count     Frequency (%)
3                     797498    14.6%
.                     689310    12.6%
6                     653489    11.9%
4                     501904    9.2%
2                     472513    8.6%
1                     456827    8.3%
5                     453390    8.3%
7                     421527    7.7%
9                     400032    7.3%
8                     387266    7.1%
Other values (2)      247146    4.5%

Most occurring categories, scripts, and blocks

All 5480902 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

coordinateUncertaintyInMeters
Text

Missing

Distinct: 170
Distinct (%): 0.1%
Missing: 2619486
Missing (%): 90.6%
Memory size: 22.1 MiB

Length

Max length: 6
Median length: 4
Mean length: 3.963309728
Min length: 1

Characters and Unicode

Total characters: 1079019
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): < 0.1%

Sample

1st row: 1000
2nd row: 1000
3rd row: 1000
4th row: 10000
5th row: 1000

Value                 Count     Frequency (%)
1000                  227896    83.7%
5000                  16175     5.9%
2500                  9908      3.6%
30                    5354      2.0%
500                   3419      1.3%
20000                 3067      1.1%
250                   1603      0.6%
100                   731       0.3%
10000                 631       0.2%
3000                  279       0.1%
Other values (160)    3189      1.2%

Most occurring characters

Value                 Count     Frequency (%)
0                     792899    73.5%
1                     230067    21.3%
5                     32027     3.0%
2                     15324     1.4%
3                     6199      0.6%
7                     919       0.1%
8                     587       0.1%
6                     391       < 0.1%
4                     307       < 0.1%
9                     299       < 0.1%

Most occurring categories, scripts, and blocks

All 1079019 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

typeStatus
Text

Missing 

Distinct: 48040
Distinct (%): 74.7%
Missing: 2827400
Missing (%): 97.8%
Memory size: 22.1 MiB

Length

Max length: 155
Median length: 119
Mean length: 42.25187292
Min length: 18

Characters and Unicode

Total characters: 2718401
Distinct characters: 97
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37576
Unique (%): 58.4%

Sample

1st row: Type of Monotes carrissoanus Bancr.
2nd row: Type of Monotes hutchinsonianus Exell
3rd row: Type of Clerodendrum swynnertonii S.Moore
4th row: Type of Elatostema welwitschii var. cameroonense Rendle
5th row: Type of Salacia pallescens Oliv.

Value                 Count     Frequency (%)
of                    64889     17.1%
type                  24786     6.5%
isotype               18858     5.0%
de                    8270      2.2%
wild                  7872      2.1%
var                   7795      2.0%
(blank)               7736      2.0%
syntype               7182      1.9%
holotype              6442      1.7%
ex                    4574      1.2%
Other values (31175)  222024    58.4%

Most occurring characters

Value                 Count     Frequency (%)
(space)               316090    11.6%
e                     218798    8.0%
o                     210474    7.7%
a                     193266    7.1%
i                     170957    6.3%
s                     129337    4.8%
t                     122484    4.5%
r                     118864    4.4%
l                     115945    4.3%
n                     111273    4.1%
Other values (87)     1010913   37.2%

Most occurring categories, scripts, and blocks

All 2718401 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

Distinct: 233549
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 158
Median length: 107
Mean length: 29.25188866
Min length: 5

Characters and Unicode

Total characters: 84588798
Distinct characters: 166
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 92609
Unique (%): 3.2%

Sample

1st row: Galium verum L.
2nd row: Lotononis bainesii Baker
3rd row: Galium verum L.
4th row: Lotononis carinata Benth.
5th row: Galium verum L.

Value                 Count     Frequency (%)
l                     706460    6.5%
(blank)               261140    2.4%
ex                    199848    1.9%
sp                    157459    1.5%
subsp                 102763    1.0%
var                   95917     0.9%
dc                    64385     0.6%
indet                 58031     0.5%
de                    49292     0.5%
benth                 45623     0.4%
Other values (92452)  9046273   83.9%

Most occurring characters

Value                 Count     Frequency (%)
(space)               7895433   9.3%
a                     7565123   8.9%
i                     6139918   7.3%
e                     5425912   6.4%
r                     4652030   5.5%
s                     4349345   5.1%
l                     4075004   4.8%
.                     3947582   4.7%
o                     3863084   4.6%
n                     3801126   4.5%
Other values (156)    32874241  38.9%

Most occurring categories, scripts, and blocks

All 84588798 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

acceptedNameUsage
Text

Missing 

Distinct: 15519
Distinct (%): 7.4%
Missing: 2680777
Missing (%): 92.7%
Memory size: 22.1 MiB

Length

Max length: 129
Median length: 83
Mean length: 35.0914861
Min length: 12

Characters and Unicode

Total characters: 7402935
Distinct characters: 105
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5006
Unique (%): 2.4%

Sample

1st row: Listia bainesii (Baker) B.-E.van Wyk & Boatwr.
2nd row: Maerua duchesnei (De Wild.) F.White
3rd row: Leobordea eriantha (Benth.) B.-E.van Wyk & Boatwr.
4th row: Galium verum L.
5th row: Galium verum subsp. wirtgenii (F.W.Schultz) Oborny

Value                 Count     Frequency (%)
l                     52877     5.7%
(blank)               38198     4.1%
subsp                 19613     2.1%
ex                    12011     1.3%
var                   10145     1.1%
pers                  6611      0.7%
dc                    6106      0.7%
fr                    5692      0.6%
persicaria            3392      0.4%
de                    3169      0.3%
Other values (18900)  772445    83.0%

Most occurring characters

Value                 Count     Frequency (%)
(space)               719298    9.7%
a                     621947    8.4%
i                     495432    6.7%
e                     465767    6.3%
r                     413448    5.6%
s                     388485    5.2%
.                     354781    4.8%
l                     345362    4.7%
o                     343422    4.6%
n                     320278    4.3%
Other values (95)     2934715   39.6%

Most occurring categories, scripts, and blocks

All 7402935 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

kingdom
Text

Missing 

Distinct: 6
Distinct (%): < 0.1%
Missing: 29119
Missing (%): 1.0%
Memory size: 22.1 MiB

Length

Max length: 9
Median length: 7
Mean length: 6.839196903
Min length: 5

Characters and Unicode

Total characters: 19578015
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Plantae
2nd row: Plantae
3rd row: Plantae
4th row: Plantae
5th row: Plantae

Value                 Count     Frequency (%)
plantae               2553085   89.2%
fungi                 261803    9.1%
protozoa              27266     1.0%
chromista             15557     0.5%
bacteria              4150      0.1%
animalia              758       < 0.1%

Most occurring characters

Value                 Count     Frequency (%)
a                     5158809   26.4%
n                     2815646   14.4%
t                     2600058   13.3%
P                     2580351   13.2%
e                     2557235   13.1%
l                     2553843   13.0%
i                     283026    1.4%
F                     261803    1.3%
u                     261803    1.3%
g                     261803    1.3%
Other values (10)     243638    1.2%

Most occurring categories, scripts, and blocks

All 19578015 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

phylum
Text

Missing 

Distinct: 32
Distinct (%): < 0.1%
Missing: 29240
Missing (%): 1.0%
Memory size: 22.1 MiB

Length

Max length: 19
Median length: 12
Mean length: 11.83263709
Min length: 7

Characters and Unicode

Total characters: 33870900
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Tracheophyta
2nd row: Tracheophyta
3rd row: Tracheophyta
4th row: Tracheophyta
5th row: Tracheophyta

Value                 Count     Frequency (%)
tracheophyta          2448306   85.5%
ascomycota            163254    5.7%
basidiomycota         98047     3.4%
bryophyta             44751     1.6%
mycetozoa             27227     1.0%
rhodophyta            23166     0.8%
marchantiophyta       20699     0.7%
ochrophyta            13403     0.5%
chlorophyta           12637     0.4%
cyanobacteria         4137      0.1%
Other values (22)     6871      0.2%

Most occurring characters

Value                 Count     Frequency (%)
a                     5462503   16.1%
h                     5088438   15.0%
o                     3189782   9.4%
c                     2941520   8.7%
y                     2906737   8.6%
t                     2883523   8.5%
p                     2566493   7.6%
r                     2548647   7.5%
e                     2480010   7.3%
T                     2448306   7.2%
Other values (22)     1354941   4.0%

Most occurring categories, scripts, and blocks

All 33870900 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

class
Text

Missing 

Distinct: 91
Distinct (%): < 0.1%
Missing: 29606
Missing (%): 1.0%
Memory size: 22.1 MiB

Length

Max length: 20
Median length: 13
Mean length: 12.67111405
Min length: 4

Characters and Unicode

Total characters: 36266401
Distinct characters: 44
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: Magnoliopsida
2nd row: Magnoliopsida
3rd row: Magnoliopsida
4th row: Magnoliopsida
5th row: Magnoliopsida

Value                 Count     Frequency (%)
magnoliopsida         1883175   65.8%
liliopsida            441571    15.4%
polypodiopsida        103000    3.6%
lecanoromycetes       90819     3.2%
agaricomycetes        74996     2.6%
bryopsida             40413     1.4%
myxomycetes           26950     0.9%
florideophyceae       22684     0.8%
dothideomycetes       19617     0.7%
jungermanniopsida     18975     0.7%
Other values (81)     139932    4.9%

Most occurring characters

Value                 Count     Frequency (%)
i                     5652135   15.6%
o                     5163991   14.2%
a                     4697882   13.0%
s                     2810613   7.7%
d                     2689281   7.4%
p                     2685079   7.4%
l                     2475862   6.8%
n                     2085475   5.8%
g                     1985161   5.5%
M                     1912229   5.3%
Other values (34)     4108693   11.3%

Most occurring categories, scripts, and blocks

All 36266401 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

order
Text

Missing 

Distinct: 420
Distinct (%): < 0.1%
Missing: 29654
Missing (%): 1.0%
Memory size: 22.1 MiB

Length

Max length: 18
Median length: 15
Mean length: 9.520236653
Min length: 6

Characters and Unicode

Total characters: 27247717
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): < 0.1%

Sample

1st row: Gentianales
2nd row: Fabales
3rd row: Gentianales
4th row: Fabales
5th row: Gentianales

Value                 Count     Frequency (%)
poales                293986    10.3%
fabales               223945    7.8%
asterales             222843    7.8%
lamiales              207299    7.2%
gentianales           193419    6.8%
malpighiales          152625    5.3%
rosales               120272    4.2%
caryophyllales        119402    4.2%
asparagales           85682     3.0%
polypodiales          82220     2.9%
Other values (410)    1160391   40.5%

Most occurring characters

Value                 Count     Frequency (%)
a                     4663506   17.1%
l                     3770986   13.8%
e                     3591795   13.2%
s                     3577301   13.1%
i                     1568995   5.8%
r                     1067380   3.9%
o                     1061493   3.9%
n                     871863    3.2%
t                     712861    2.6%
p                     656978    2.4%
Other values (39)     5704559   20.9%

Most occurring categories, scripts, and blocks

All 27247717 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

family
Text

Distinct: 1546
Distinct (%): 0.1%
Missing: 28414
Missing (%): 1.0%
Memory size: 22.1 MiB

Length

Max length: 21
Median length: 18
Mean length: 10.79378024
Min length: 6

Characters and Unicode

Total characters: 30906090
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126
Unique (%): < 0.1%

Sample

1st row: Rubiaceae
2nd row: Fabaceae
3rd row: Rubiaceae
4th row: Fabaceae
5th row: Rubiaceae

Value                 Count     Frequency (%)
fabaceae              210718    7.4%
asteraceae            203450    7.1%
poaceae               181701    6.3%
rubiaceae             137140    4.8%
cyperaceae            88397     3.1%
rosaceae              86224     3.0%
lamiaceae             82656     2.9%
brassicaceae          53902     1.9%
caryophyllaceae       49173     1.7%
orchidaceae           47318     1.7%
Other values (1536)   1722645   60.2%

Most occurring characters

Value                 Count     Frequency (%)
a                     7243148   23.4%
e                     6604620   21.4%
c                     3451551   11.2%
i                     1391511   4.5%
r                     1290763   4.2%
o                     1139048   3.7%
l                     947896    3.1%
n                     879889    2.8%
t                     791464    2.6%
s                     694400    2.2%
Other values (45)     6471800   20.9%

Most occurring categories, scripts, and blocks

All 30906090 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

genus
Text

Missing 

Distinct: 18030
Distinct (%): 0.6%
Missing: 58006
Missing (%): 2.0%
Memory size: 22.1 MiB

Length

Max length: 21
Median length: 18
Mean length: 8.540898716
Min length: 2

Characters and Unicode

Total characters: 24202618
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2990
Unique (%): 0.1%

Sample

1st row: Galium
2nd row: Lotononis
3rd row: Galium
4th row: Lotononis
5th row: Galium

Value                 Count     Frequency (%)
carex                 34049     1.2%
rubus                 33062     1.2%
cyperus               18235     0.6%
ranunculus            17402     0.6%
rosa                  17300     0.6%
asplenium             15293     0.5%
hieracium             14418     0.5%
euphorbia             14265     0.5%
cladonia              14203     0.5%
psychotria            13506     0.5%
Other values (18006)  2641999   93.2%

Most occurring characters

Value                 Count     Frequency (%)
a                     2967968   12.3%
i                     2208019   9.1%
e                     1680374   6.9%
r                     1619592   6.7%
o                     1564259   6.5%
u                     1376822   5.7%
s                     1349174   5.6%
l                     1298927   5.4%
n                     1211229   5.0%
t                     997676    4.1%
Other values (48)     7928578   32.8%

Most occurring categories, scripts, and blocks

All 24202618 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

Distinct: 56222
Distinct (%): 1.9%
Missing: 7633
Missing (%): 0.3%
Memory size: 22.1 MiB

Length

Max length: 31
Median length: 24
Mean length: 8.611555058
Min length: 1

Characters and Unicode

Total characters: 24836629
Distinct characters: 61
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15989
Unique (%): 0.6%

Sample

1st row: verum
2nd row: bainesii
3rd row: verum
4th row: carinata
5th row: verum

Value                 Count     Frequency (%)
sp                    182985    6.3%
indet                 24318     0.8%
vulgaris              13153     0.5%
arvensis              12306     0.4%
africana              11456     0.4%
palustris             9392      0.3%
officinalis           8454      0.3%
repens                7389      0.3%
abyssinica            7280      0.3%
alpina                7045      0.2%
Other values (56146)  2600349   90.2%

Most occurring characters

Value                 Count     Frequency (%)
a                     3205922   12.9%
i                     2818120   11.3%
s                     2033854   8.2%
e                     1786581   7.2%
r                     1614305   6.5%
l                     1544409   6.2%
n                     1523857   6.1%
u                     1465535   5.9%
o                     1357514   5.5%
t                     1281133   5.2%
Other values (51)     6205399   25.0%

Most occurring categories, scripts, and blocks

All 24836629 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

nomenclaturalCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 11566952
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ICBN
2nd row: ICBN
3rd row: ICBN
4th row: ICBN
5th row: ICBN

Value                 Count     Frequency (%)
icbn                  2891738   100.0%

Most occurring characters

Value                 Count     Frequency (%)
I                     2891738   25.0%
C                     2891738   25.0%
B                     2891738   25.0%
N                     2891738   25.0%

Most occurring categories, scripts, and blocks

All 11566952 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.

taxonomicStatus
Text

Missing 

Distinct: 13
Distinct (%): < 0.1%
Missing: 320988
Missing (%): 11.1%
Memory size: 22.1 MiB

Length

Max length: 22
Median length: 13
Mean length: 12.8684629
Min length: 7

Characters and Unicode

Total characters: 33081601
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: accepted name
2nd row: accepted name
3rd row: unchecked name
4th row: accepted name
5th row: accepted name

Value                 Count     Frequency (%)
name                  2436929   48.5%
accepted              2069017   41.2%
unchecked             363316    7.2%
synonym               107045    2.1%
tentative             17381     0.3%
orthog                4903      0.1%
variant               4903      0.1%
invalid               4588      0.1%
later                 1962      < 0.1%
homonym               1962      < 0.1%
Other values (11)     8245      0.2%

Most occurring characters

Value                 Count     Frequency (%)
e                     7339412   22.2%
c                     4864671   14.7%
a                     4539772   13.7%
n                     3050775   9.2%
m                     2549881   7.7%
(space)               2449501   7.4%
d                     2439454   7.4%
t                     2136099   6.5%
p                     2069193   6.3%
h                     370181    1.1%
Other values (13)     1272662   3.8%

Most occurring categories, scripts, and blocks

All 33081601 characters fall into a single category, script, and block, each reported as (unknown) with a frequency of 100.0%. The per-category, per-script, and per-block character frequencies therefore repeat the "Most occurring characters" table above.